SUMMARY



This report describes the experimental designs and analyses to be used in the 
four specific areas that were assigned to the NPS investigators. 

In addition to developing experimentation plans to meet specific goals of 
ACCAT, this report represents development of experimentation technology. This 
technology will itself undergo "testing" as future experiments are undertaken at 
ACCAT. Much of the methodology for the specific experiments described herein will 
prove useful in follow-on experiments. 

Due to normal problems associated with starting a project like ACCAT, there 
are limitations imposed on the real-world aspects of these initial experiments. 
Looking into the future, the results of what one learns in the early stages will 
affect subsequent research. It is necessary to get operationally oriented in 
the experiments as soon as possible. It is important for the ACCAT test bed to 
develop the capability for producing experimental resources such as scenarios, 
special software, etc., as the needs arise. 

Pilot trials are recommended in some sections to allow determination of
appropriate levels of the experimentation factors. While this phase need not be
part of the formal recorded experiments, it is important to document its outcome.

The NPS team will need at least 30 days for analysis of the data from the
experiments. As of this writing, the final report is expected on 1 September 1977.
Therefore, the data should be delivered to us by 1 August 1977.



I. DISPLAY EXPERIMENTS 



Executive Summary . 

This section describes those experiments which are planned to evaluate the 
transfer of information between the software/hardware components and the human 
operator/decision maker in the ACCAT testbed. A variety of experiments are 
proposed which will provide a comprehensive evaluation of the total information 
transfer system. The development of the methodologies and experimental con- 
cepts will provide an output which will form the basis of methodologies and 
concepts to be used in future experiments. 

It would not be feasible to try to evaluate, in one experiment, all the 
display variables which are of concern to the ACCAT team. A modular approach 
is therefore proposed in which a sequence of experiments will be carried out. 
For example, in the first experiment the total economics of a GENISCO color 
display will be one of the variables of interest (economics here refers not 
only to cost but also to the effect on operator information processing times, 
usability of the total system, etc.). The results of the first experiment 
with regard to this question, will then form a guideline for colors to be 
used in the display presentation in the next module (experiment) of the ex- 
perimentation process. The sequence of experiments will be phased such that 
results of previous experiments will form the basis of upcoming experiments 
needed to evaluate other variables in the display area. The longer goal is 
to tie the sequential experimental results together to form an effective basis 
for reaching design decisions (specifications) on how to most effectively
transfer information from the system to the operator/decision maker. 

Data from the experiments will be in the form of both subjective and ob- 
jective information, i.e. measurements of times, accuracies, opinions, etc. 
Proper statistical analyses of these dependent variables can then be performed 
which will give answers such as whether different conditions of a given 
variable affect transfer of information to the operator, etc. In addition to
gaining information about various independent variables, interactions between 
the variables can be assessed as well as the correlation between independent 
variables and human performance in the system. 

The following is a partial list of the variables which will be examined
during the sequence of experiments: low resolution color displays versus high
resolution black/white displays; Mercator versus polar map projection; types
of geographic locations and situations represented therein; function buttons
versus keyboard entry; where to present alphanumeric information of various
types (on status boards, on the CRT, etc.); NTDS symbols versus other designs;
method of presenting ship tracks (bearing and speed); size of display screen
really needed; zoom capability on maps; etc.

A. DISPLAY EXPERIMENT I 

1. EXPERIMENT TITLE : Color Combination-Naval Situation 

2. OBJECTIVE : To make a comparison of the usability, economics and
feasibility of various color display combinations versus the conventional black
and white display, and to compare these under 24 various naval situations which
vary in type of threat presented and number of enemy, friendly and neutral
forces present. A second objective in this experiment is to evaluate the
presentation of newly acquired enemy information by displaying it in one of two
modes, blinking or non-blinking. Measures of effectiveness to evaluate these
objectives will be operator proximity threat assessment time and accuracy,
operator detection time for newly presented enemy symbols and operator
subjective opinions.

3. RESOURCES REQUIRED : 

a. ACCAT testbed facility 

b. 20 subjects - 3 hours each 

(2 subjects can participate in the experiment at the same time) 
c. 24 Naval situations to be displayed

d. 1 (one) experimenter 

e. 1 color display (GENISCO) 

f. Ability to record operator's proximity threat assessment time 

g. Ability to record operator's proximity threat assessment accuracy 

(i.e. the number of platforms he believes to be a threat. See 
details in 4.b. which follows) 

h. Ability to record operator's detection time of newly inserted enemy 
symbol 

i. Record of operator's subjective opinions to a series of questions 
presented on the display during his debriefing period 

j. All data should be provided on 9 track IBM magnetic tape or IBM 
cards for final evaluation 

4. GENERAL CONTEXT 

a. Concept and Need . 

One of the new technologies to be examined in the ACCAT testbed
is the use of a multi-color CRT display. In order to evaluate
its effectiveness in facilitating the transfer of information from
the software/hardware system to the human operator, it is imperative
that the display experiments be undertaken in order to determine
the most effective means for presenting the information to the
operator.

b. General Situation for the Experiment . 

This experiment is designed so that a subject will be presented 
with 24 naval situations, one at a time, and in each situation, a 
different color combination will be used to represent enemy and
friendly forces, longitude and latitude lines, etc. A trial will
consist of displaying to the subject one situation with a given
color combination. The operator's main task will be to make a
proximity threat assessment of the situation. The experimental 
conceptual design is shown below in Figure 1. The 24 naval situa- 
tions displayed are part of the resources required and will be 
developed by the ACCAT project team. (See details in Section 6.b. - 
Comments and Special Instructions.) 



[Figure 1 graphic not reproduced; surviving axis label: COLOR COMBINATION]

Figure 1. CONCEPTUAL DESIGN

Accordingly, a given subject will be presented a display of
a naval situation. He will make a proximity threat assessment of
the situation and, when finished, type an F (for finished) into
the keyboard associated with the GENISCO display. When the situation
is presented on the display screen, a clock will start running
and subsequently stop when the operator depresses the F key, thus
obtaining a measure of assessment time. When the F key is
depressed, it will not only stop the clock but also cause the
following question to appear on the display screen (or a side screen):

"If you believe there was no threat in the display you
just observed, type a zero (0) on the keyboard. If you
believe there was a threat, enter on the keyboard the number
of enemy platforms which you considered to be a threat."

(Subject types in a number.) 

If the subject enters 0 (zero) to the above question, he
thinks no proximity threat exists. If he enters any other number,
he thinks a threat exists. It is then possible to analyze whether
he was right or wrong, and if he was right, one can also tell whether
he was totally right by looking at his answer to the above question.
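
The scoring rule just described can be sketched as follows. This is an
illustration only, written in Python; the function name and the names of
the returned fields are assumptions and are not part of the ACCAT software.
The expert count is the number of threatening platforms determined by the
experienced military personnel of the project team.

```python
def score_assessment(subject_count: int, expert_count: int) -> dict:
    """Compare the number keyed in by the subject with the expert answer."""
    subject_says_threat = subject_count > 0
    threat_exists = expert_count > 0
    # "Right or wrong" refers only to the threat / no-threat judgement.
    judgement_correct = subject_says_threat == threat_exists
    return {
        "judgement_correct": judgement_correct,
        # "Totally right" additionally requires the counts to agree.
        "count_exact": judgement_correct and subject_count == expert_count,
        # Signed error: positive means the subject over-counted threats.
        "count_error": subject_count - expert_count,
    }
```

For example, a subject who enters 2 when the experts count 3 threatening
platforms is right about the existence of a threat but not totally right.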

Once the subject has entered his response to the threat assess- 
ment question, this will trigger the next display situation to be 
presented. It will be necessary for experienced military personnel
of the ACCAT project team to determine if a proximity threat
does exist for each situation, and if so, how many enemy plat- 
forms are a threat. These answers can then be compared with each 
subject's answers. The exact distance for which a proximity threat 
exists will be determined in pilot trials described in Section 6.b. 

It is anticipated the proximity threat assessment will take 
around 30 seconds to 1 minute. Thus, in a given experimental 
session, a subject will receive eight (8) displays before having 
a rest period. Consequently, it is planned to run each subject 
in 3 experimental sessions of eight trials each (see typical 
schedule for subjects which is described later). 

To summarize the foregoing, subject 2 for example, will enter 
the experimental area, be indoctrinated and receive some practice,
and then be presented with his first sequence of 8 trials. Referring
to Figure 1, the second trial for subject 2 might be a presentation 
of situation 23 under color combination 1 with a blinking enemy 
symbol which will be explained shortly. 

The specific color combinations referred to are illustrated in 
TABLE I. The monochromatic condition can be obtained by using 
the GENISCO in a black/white mode with no color being generated. 

In addition to the main proximity threat assessment task for the 
observer, the experiment is designed to investigate the presenta- 
tion of newly acquired enemy information in two different modes. 

At a random time within the first 5 seconds after a display 
situation (trial) begins, another new enemy symbol will appear on 
the display. It will appear on the display screen either 1) as a new
symbol being constantly displayed or 2) as a new symbol which
blinks on and off every 0.25 seconds. In other words, the new
symbol will simply appear as a steady lighted symbol or it will be 
blinking. This is what is meant by STATUS OF THE ENEMY SYMBOL TO 
BE DETECTED shown in Figure 1. The subjects' task will be to make 
the proximity threat assessment as described earlier, and in addi- 
tion detect the newly presented enemy symbol. A clock will start 
when the enemy symbol appears and will be stopped by the subject when 
he has detected it by the depression of the letter D (for detect) 
on the keyboard. It is also planned to simply have the subject 
point to the symbol and the experimenter in the room will keep a 
record of whether the subject had identified the newly presented 
symbol or simply thought he had. This will be fairly obvious in the 
blinking condition but not in the steady state (no blink) condition. 
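
The timing of the inserted symbol can be sketched as below. This is a
minimal illustration under stated assumptions: the onset is drawn uniformly
over the first 5 seconds, and "blinks on and off every 0.25 seconds" is read
as the symbol changing state every 0.25 seconds, starting in the "on" state.
The function and constant names are hypothetical.

```python
import random

MAX_ONSET_S = 5.0           # symbol appears within the first 5 seconds
BLINK_HALF_PERIOD_S = 0.25  # assumed: state toggles every 0.25 seconds


def draw_onset(rng: random.Random) -> float:
    """Random insertion time, in seconds after the trial display begins."""
    return rng.uniform(0.0, MAX_ONSET_S)


def symbol_visible(t: float, onset: float, blinking: bool) -> bool:
    """Whether the new enemy symbol is lit at time t (seconds into trial)."""
    if t < onset:
        return False          # not yet inserted
    if not blinking:
        return True           # steady condition: lit from onset onward
    phase = int((t - onset) // BLINK_HALF_PERIOD_S)
    return phase % 2 == 0     # even phases are "on", odd phases are "off"
```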
NOTE: If many colors can be generated from the GENISCO display, then we would
suggest the specific colors listed below:

    Color      Angstroms

    Red        6420
    Orange     6100
    Amber      5960
    Yellow     5820
    Aqua       5560
    Green      5150
    Blue       4760
    Purple     4300



These specific spectral colors are chosen because past research has shown 
them to be equally discriminable to the human observer. 

If Angstrom comparisons cannot be obtained or set on the GENISCO, then the 
next best choice would be to select color book numbers from the Munsell color 
system and try to match the display color to the Munsell color. The follow- 
ing Munsell book numbers are colors which are also equally discriminable to 
the observer in the Munsell system: 



Munsell Book Numbers: 3R, 9R, 9YR, 1GY, 3G, 7BG, 9B, 9PB, 3RP



TABLE I. COLOR COMBINATION SPECIFICS 
Thus, in addition to time and accuracy for the proximity threat
assessment, detection time and accuracy will also be obtained for 
each situation (trial) presented. If the subject does not detect 
the newly presented enemy symbol, the depression of the F key 
should also stop the running of the detection time clock. 
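
The two clocks described above can be reduced to a small calculation on key
press times. The sketch below is illustrative only; the argument names are
assumptions, and all times are taken to be seconds on a common clock. Note
that a D key press only shows that the subject believed he had detected the
symbol; whether he pointed to the correct symbol is recorded separately by
the experimenter.

```python
from typing import Optional


def trial_times(display_on: float, symbol_on: float,
                f_pressed: float, d_pressed: Optional[float]) -> dict:
    """Derive assessment and detection times for one trial."""
    # Assessment clock: starts with the display, stops on the F key.
    assessment_time = f_pressed - display_on
    # Detection clock: starts when the new symbol appears and stops on the
    # D key; if D was never pressed, the F key stops it instead.
    d_key_used = d_pressed is not None and d_pressed <= f_pressed
    stop = d_pressed if d_key_used else f_pressed
    return {
        "assessment_time": assessment_time,
        "detection_time": stop - symbol_on,
        "d_key_used": d_key_used,
    }
```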

To summarize, in a given trial a subject looks at a given naval
situation with a given color combination, and the enemy symbol
for that situation is presented in a blinking or no-blink mode.
c. Subjects 

Subjects must be representative of the real world command control 
officer; hopefully the subjects would actually be some of those 
officers. That is, the subjects should have previous experience 
with the types of tasks required. 

Subjects should not be color blind (for proper evaluation of the 
color display). However, since we are really interested in whether 
or not the subject can distinguish the colors used on the current 
display, a simpler check will be to have the experimenter ask the 
subject to describe each color during the initial indoctrination 
and practice trials. If the subject identifies red as red, etc., 
the experimenter should note this or any discrepancies. 

Each subject will be assigned a random sequence of 24 trials 
(experimental conditions) which are described later in detail. 

(NOTE: Each subject performs in only 24 of the possible 192 conditions
shown in Figure 1.) The 24 trials for each subject will
actually be presented in 3 sessions of eight trials each, with each
session followed by a 15 minute break.

Each subject should be asked to be available for about 3 hours. 
It is proposed to run two subjects simultaneously in the following
sequence where each of the following is 15 minutes in duration: 

I = Indoctrination 
P = Practice 

W = Work for eight trials 
R = Rest 
D = Debrief 

Typical Schedule for 2 Subjects: 

Subject 1: I-P-R-W-R-W-R-W-R-D 

Subject 2: I-P-R-W-R-W-R-W-R-D 
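
As a check on the time requested from each subject, the schedule can be
totalled directly. The short Python sketch below is illustrative; the
variable names are assumptions, and the 30 second to 1 minute figure per
trial is the anticipated assessment time stated earlier.

```python
BLOCK_MINUTES = 15                          # every block lasts 15 minutes
SCHEDULE = "I-P-R-W-R-W-R-W-R-D".split("-")
TRIALS_PER_WORK_BLOCK = 8

total_minutes = BLOCK_MINUTES * len(SCHEDULE)
total_trials = SCHEDULE.count("W") * TRIALS_PER_WORK_BLOCK

# Anticipated assessment load inside one 15 minute work block.
min_work_minutes = TRIALS_PER_WORK_BLOCK * 0.5   # 30 seconds per trial
max_work_minutes = TRIALS_PER_WORK_BLOCK * 1.0   # 1 minute per trial

print(total_minutes, total_trials, min_work_minutes, max_work_minutes)
```

The ten blocks total 150 minutes, which leaves a margin within the
3 hours requested, and the 24 trials fall in the three work blocks.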

5. EVALUATION 

a. Data Collection 

It is extremely important that all data be correctly associated 
with the proper subject and trial number for that subject. 

As described in the previous sections, the following variables
would be measured for each trial:

1) Operator's assessment time 

2) Operator's entering of a number after he's done assessing 
to indicate the number of platforms he believes to be a 
threat in that trial 

3) Operator's detection time of newly acquired symbol 

In addition, the following is a list of subjective questions to 
be asked in the subject's debriefing period. The questions should 
be presented on the display and subject's answers recorded by his 
entering responses into the keyboard. 

DEBRIEF QUESTIONS : 

1. What is your rank?

(subject (S) types in rank) 
2. What is your specialty?

(S types in specialty) 

3. How many years have you been involved in command control 
operations? 

(S types in number of years) 

4. Have you ever served as a command control officer? 

(S types yes or no) 

5. Do you have any NTDS experience? 

(S types yes or no) 

6. Have you ever had any experience similar to the tasks you 
performed in this experiment? 

(S types yes or no) 

7. Did the blinking of the unknown enemy symbol help you to
detect it more easily than when it didn't blink?

(S types yes or no) 

8. What do you feel is the maximum total number of friendly, 
enemy and neutral platforms an operator could handle on the display 
without being confused? 

(S types a number) 

9. Which colors do you prefer, A or B, for friendly-enemy colors? 

A) Friendly - Green 
Enemy - Red 

B) Friendly - Blue 
Enemy - Orange 

(S types A or B) 

10. Would you prefer to use (A) NTDS symbols or (B) symbols in 
the shape of an airplane to indicate an airplane, symbols in the 
shape of a ship to indicate a ship, etc.? 

(S enters A or B) 
11. Do you prefer land masses to be displayed in A) Blue or B)
Green? 

(S enters A or B) 

12. Do you prefer longitude and latitude grid lines to be displayed 
in: 



A) Purple 

B) Blue 

C) Green 

(S enters A or B or C) 

13. This question requires one slide to be presented. It is presented
4 times and each of the color combinations used in the experiment
is presented once (A, B, C, D). The subject is asked to
tell which color combination he prefers (monochromatic is one of
the conditions). S should be allowed to switch back and forth
among the four to make up his mind.

(S enters A or B or C or D) 



In order to effectively assemble the above information and 
properly associate each piece of data with the proper subject 
and trial, the following example, TABLE II, is one suggested 
method for formatting the data. 
EXAMPLE FORMAT

[Table graphic not reproduced. The surviving fragment is the end of a
record, "i, 12, A, B, B, B, C", labeled "Answers to debrief questions".]

TABLE II. POSSIBLE FORMATTING SCHEME FOR DATA
(Referenced to Figure 1)
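
One possible per-trial record layout is sketched below in Python. It is not
the layout of TABLE II, which did not survive reproduction; the field names
and the comma-separated output are assumptions chosen for illustration.
The point is only that every measurement is carried together with its
subject number and trial number.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class TrialRecord:
    subject: int
    trial: int                # order of presentation for this subject, 1..24
    cell: int                 # experimental cell (condition) number, 1..192
    assessment_time_s: float
    threat_count: int         # number keyed in after the F key
    detection_time_s: float
    detected_correctly: bool  # noted by the experimenter


def write_records(records, stream) -> None:
    """Write one header line, then one line per trial."""
    names = [f.name for f in fields(TrialRecord)]
    writer = csv.DictWriter(stream, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


buffer = io.StringIO()
write_records([TrialRecord(2, 1, 137, 41.2, 3, 4.5, True)], buffer)
print(buffer.getvalue())
```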
b. Data Analysis



Times to analyze and times to detect will be analyzed in the con- 
ceptual design shown in Figure 1. This is the type of design in 
which each subject performs in only 24 of the 192 experimental 
conditions. Analysis of variance techniques with a linear model 
incorporating selected two way interaction terms will be used. 
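
To illustrate the kind of computation involved, the sketch below works out
the F statistic for a single factor only (for example, color combination).
It is a deliberately simplified stand-in and an assumption of this
illustration: the analysis planned here uses a linear model with several
factors and selected two way interactions, which this one-way calculation
does not reproduce. The function name is hypothetical.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations about their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

The resulting F ratio would be compared with the F distribution on the two
stated degrees of freedom to judge significance.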

Since times to perform tasks are typically distributed as some 
member of the Gamma family, it is anticipated that a log trans- 
form of the assessment and detection times will be necessary to 
stabilize the variance. 
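
The effect of the log transform can be seen in a small simulation. The
sketch below is illustrative only: the shape and scale values are arbitrary
assumptions, not estimates from ACCAT data. Two conditions share a Gamma
shape but differ in scale, so their raw variances differ widely while the
variances of the logged times are nearly equal.

```python
import math
import random
from statistics import variance

rng = random.Random(1977)   # fixed seed so the sketch is repeatable
SHAPE = 4.0                 # assumed common Gamma shape parameter


def simulate_times(scale: float, n: int = 5000):
    """Simulated task times (seconds) for one experimental condition."""
    return [rng.gammavariate(SHAPE, scale) for _ in range(n)]


fast = simulate_times(scale=5.0)    # mean about 20 seconds
slow = simulate_times(scale=20.0)   # mean about 80 seconds

raw_ratio = variance(slow) / variance(fast)
log_ratio = (variance([math.log(t) for t in slow])
             / variance([math.log(t) for t in fast]))
print(f"raw variance ratio {raw_ratio:.1f}, log variance ratio {log_ratio:.2f}")
```

For a Gamma variable with fixed shape, the standard deviation is
proportional to the mean, which is why taking logs removes the dependence
of the variance on the condition.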

In addition to the analysis of variance techniques, multiple 
comparison techniques will be used to determine exact differences 
in significant main effects such as color combinations, situations 
and status of detected enemy symbol.

Correlation and regression techniques will be used where needed 
for association between variables or for prediction. 

Debriefing responses will be tabulated and statistical summaries
provided. Nonparametric correlation techniques will be
used where possible on the debriefing response data.
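
One common nonparametric correlation is Spearman's rank correlation, shown
below as an example of the class of technique meant here; the choice of
Spearman's coefficient is an assumption of this sketch, as are the function
names. Tied values receive the average of the ranks they span.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks, with ties given the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of ranks i+1 through j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

It could be applied, for instance, to years of command control experience
(debrief question 3) against each subject's mean assessment time.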
c. Anticipated Results 

The results should indicate whether there is a difference in the 
color combinations used, i.e. it might show that color combination 
3 is better than the others. The same should occur for situations, 
in that the analysis could indicate if certain types of situations 
presented cause a difference in the operator's performance. Like- 
wise, if there is a difference in the blink - no-blink conditions, 
this will be indicated. 

In addition, interactions between the variables may be significant,
which might indicate that certain color combinations work best for
certain naval situations and other color combinations work best 
for other situations. 

For the debriefing response, one may find a relationship between 
operator experience and performance, a relationship between color 
preferences for symbols and performance, etc. The subjective re- 
sponses may also provide insight into future symbol design, etc. 

It is also anticipated that the situations may be subjectively 
classified into three or more clutter levels and then performance 
analyzed with respect to clutter level also. 

6. COMMENTS AND SPECIAL INSTRUCTIONS 

a. Subjects' Experimentation Trial Sequences 

The following is a detailed outline of each experimental sequence 
for each subject. 

Table III assigns a number to every experimental cell (condition) 
in Figure 1. Table IV then shows the exact experimental cells of 
Table III in which a subject will perform. Since each
subject performs 3 sequences of eight trials each, Table IV gives
3 sequences of cell numbers for each subject. In Table IV, each
subject's sequential trials are listed in exact order. The sub- 
ject will take a 15 minute break after each sequence of 8 trials. 

Table V shows the sequences of Table IV in a different arrange- 
ment. Table V shows each subject who has been assigned to each 
cell of Table III. In Table V, the upper number in the cell is the 
subject number and the lower number in the cell is the trial num- 
ber for that subject. Therefore, Table IV and Table V can be 
cross referenced to verify a subject's sequential order of ex- 
perimental testing. Using 16 subjects providing 24 data points 
each, the final design will include 2 data points in each cell of
Table III. All subject assignments have been randomized and 
balanced where appropriate. 
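
The counting behind this design can be verified with a short sketch. The
assignment rule below is hypothetical and is NOT the assignment of
Tables III, IV and V; it is one simple way, assumed here for illustration,
to give each of 16 subjects all 24 situations once while placing exactly
2 data points in each of the 192 cells (24 situations by 4 color
combinations by 2 symbol statuses).

```python
import random
from collections import Counter
from itertools import product

N_SITUATIONS, N_COLORS, N_SUBJECTS = 24, 4, 16
# The 8 treatment pairs: color combination crossed with symbol status.
TREATMENTS = list(product(range(1, N_COLORS + 1), ("blink", "steady")))


def subject_trials(subject: int, rng: random.Random):
    """24 trials for one subject: every situation once, in random order."""
    trials = []
    for situation in range(1, N_SITUATIONS + 1):
        # Cyclic shift so treatments rotate across subjects and situations.
        color, status = TREATMENTS[(subject + situation) % len(TREATMENTS)]
        trials.append((situation, color, status))
    rng.shuffle(trials)
    return trials


rng = random.Random(0)
design = {s: subject_trials(s, rng) for s in range(1, N_SUBJECTS + 1)}
cell_counts = Counter(cell for trials in design.values() for cell in trials)

# Each subject's 24 trials split into 3 sessions of 8 trials each.
sessions = {s: [t[i:i + 8] for i in (0, 8, 16)] for s, t in design.items()}
print(len(cell_counts), set(cell_counts.values()))
```

Under this rule each subject also meets each of the 8 treatment pairs
exactly 3 times.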

It should be noted that there are experimental sequences for 
16 subjects. These 16 sequences must all be run in these orders 
to allow proper completion of the experimental design. In the 
section on resources required, 20 subjects were requested. If
the first 16 subjects all provide data, one could stop data
gathering. However, if subject 3, for example, had a breakdown of
equipment, etc., during his trials, then it would be necessary that
another subject be assigned to the exact same sequence which
subject 3 had.

b. Pilot Trials and Development of Situations and Proximity Threat 
As mentioned earlier, the ACCAT project team will develop the 24 
naval situations to be displayed as part of the resources required 
to run the experiment. These situations will vary in 1) the num- 
ber of friendly, enemy and neutral platforms present, 2) the 
geographic location represented and 3) the location of the given 
platforms. See Figures 2 and 3 as examples. It will be necessary 
to run some initial pilot trials to ensure that the level of dif- 
ficulty represented on the plates spans the threshold range of the 
human operator, i.e. the pilot trials need to be done to make sure 
the situations are not all too easy or all too hard for an operator. 
If such were the case, no useful information would be gained as 
the subjects would all assess every situation very easily, or none 
of the subjects would be able to assess the situation if all situa- 
tions were too difficult. In this experiment, NTDS symbology will 
be used. 
FIGURE 2. AN EXAMPLE OF ONE OF THE NAVAL SITUATIONS UNDER
COLOR COMBINATION 2 WITH 3 FRIENDLY, 3 ENEMY 
AND 1 NEUTRAL PLATFORM. 
FIGURE 3. AN EXAMPLE OF ONE OF THE NAVAL SITUATIONS UNDER
COLOR COMBINATION 3, WITH 7 ENEMY, 7 FRIENDLY 
AND 3 NEUTRAL PLATFORMS. 






In addition, the pilot trials need to be run to provide a
general guideline for the distance criterion to be used in the
proximity threat assessment. That is, will an enemy platform be
a threat if it is within 300 miles or within 600 miles, etc.? The
pilot trials should provide a guideline for this criterion. All
information from pilot trials should be written down or documented
in some manner for the record by the ACCAT project team.
c. Subject Indoctrination 

All subjects will be informed as to how to operate the equipment, 
what their task will be and what will be required of them. In 
addition, they will have an opportunity to practice after the in- 
doctrination, ask questions, etc. before they will be required to 
perform experimental trials. 

Another detail to include in the display presentation is when 
the enemy symbol is inserted to be detected, it should appear 
quietly and not cause the display to jump or flash and thus signal 
the operator that the new symbol has been inserted somewhere. 

It should also be noted that the experimenter has two tasks 
to manually record if they are not automated. One is the experi- 
menter will have to determine if each subject interprets the colors 
used as red being red, etc., and note any deviations. Secondly, 
the experimenter will have to keep a running check on whether the 
subject actually detected the newly inserted enemy symbol, or whether
the subject simply depressed the D key and pointed to another 
symbol which was incorrect. 

The pilot trials will also help verify separations to be used on 
latitude-longitude grid lines. It is presently felt that displays 
of 600 miles on a side should show grid lines every 5 degrees apart. 
Larger or smaller displays should show grid lines separated in a similar
proportion. 
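
The proportional rule can be written as a one-line calculation. The sketch
below assumes strict linear scaling from the present working figure of
5 degrees for a display 600 miles on a side; the function name is
hypothetical, and the pilot trials may revise the base figure.

```python
BASE_SIDE_MILES = 600.0     # display size for which 5 degrees is proposed
BASE_SPACING_DEGREES = 5.0


def grid_spacing_degrees(side_miles: float) -> float:
    """Grid line separation scaled in proportion to the display side."""
    return BASE_SPACING_DEGREES * side_miles / BASE_SIDE_MILES
```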
B. DISPLAY EXPERIMENT II



1. EXPERIMENT TITLE : Resolution and Type of Symbols 

2. OBJECTIVE : To compare, under a given task, NTDS symbols with other
types of symbols in 24 different naval situations under three display conditions:
1) the best color combination from Display Experiment I, 2) Black and White high
resolution, 3) Black and White low resolution.

3. RESOURCES REQUIRED : 

a. ACCAT Testbed 

b. 24 naval situation displays 

c. One experimenter and 20 operators 

d. Color Display 

e. High resolution B/W display 

f. Ability to record task performance times and accuracy 

g. Ability to record subjective opinions 

h. Results of Display Experiment I 

i. Set of newly designed symbols 

4. GENERAL CONTEXT 

a. Concept and Need 

This is a second step needed in the experimentation sequence to 
show the validity of using NTDS symbols or other appropriately de- 
signed symbols. Given the results of Display Experiment I, this 
experiment is needed to compare the best color combination from 
the first experiment versus the use of black and white low and 
high resolution displays. 

b. General Situation for the Experiment 

The design is very similar to that of Display Experiment I, where
set of symbols is a variable (taking the place of Color Combination)
and level of resolution is a variable (taking the place of
blink - no-blink).
5. EVALUATION : Evaluation and data analysis will be almost identical to
the concepts and techniques used in Display Experiment I. Anticipated results
might be that the analysis will show a difference in operator performance when 
using different symbology, or it might show that low resolution color is far 
more effective than high resolution black and white presentations. One might 
also expect to find interactions showing that high resolution works best for 
certain situations and vice versa. 
C. DISPLAY EXPERIMENT III



1. EXPERIMENT TITLE : Entry Devices and Alphanumeric Displays 

2. OBJECTIVE : To evaluate the usefulness of various methods for displaying
alphanumeric information and the use of entry devices under a variety
of situations.

3. RESOURCES REQUIRED : 

a. ACCAT Testbed 

b. 24 naval situation displays 

c. One experimenter and 20 operators 

d. Function keys and regular keyboard 

e. Ability to display alphanumeric information on main display 
and side displays 

f. Ability to record performance times 

g. Ability to record subjective opinions 

4. GENERAL CONTEXT 

a. Concept and Need 

This experiment is similar in concept design to Display Experiment 
I with 24 naval situations except that the variables here are: 

1) Entry device 

a) Keyboard

b) Function Buttons 

2) Alphanumeric Display 

a) On display screen anywhere 

b) On split portion of screen 

c) On a status display beside the main display 

b. General Situation for the Experiment 
The design is similar to that of Experiment I where entry devices
and types of alphanumeric display are now the main variables to be
investigated under different situations. 

5. EVALUATION : Evaluation and data analysis will be almost identical
to that of Display Experiment I. Anticipated results might show differences
between:

1) Keyboard versus Function Button Entry 

2) Split Screen Display versus Side Display of Alpha Numeric Information 



D. DISPLAY EXPERIMENT IV 



1. EXPERIMENT TITLE : Map Projection and Track History Display 

2. OBJECTIVE : To evaluate Mercator versus polar type map projection and
different methods of displaying track history. The objective is to show the
effects of these variables on transferring information to the operator.

3. RESOURCES REQUIRED : 

a. ACCAT Testbed 

b. 24 naval situation displays 

c. One experimenter and 20 operators 

d. Ability to display Mercator and polar map projections

e. Ability to display track history 

f. Several alternative designs for displaying track history 

g. Results of previous experiments 

h. Ability to record operator times and accuracy 

i. Ability to record operator subjective opinions 

4. GENERAL CONTEXT 

a. Concept and Need 

The general concept of this experiment is to evaluate the effect 
of different types of map projections and track history presenta- 
tions on operator performance. The experiment is needed to help 
define which methods are best for future systems. 

b. General Situation for the Experiment 

The design is similar to that of Experiment I, where types of map 
projection and types of track history are the variables to be 
examined under 24 different naval situations.

5. EVALUATION : Evaluation and data analysis will be almost identical
to that of Display Experiment I. Anticipated results might show Mercator
projection to be best, along with a certain design for track history. Results
might also show Mercator projection to work best for certain situations but
polar display to be best in other situations.
II. TECA EXPERIMENTS



EXECUTIVE SUMMARY . Experimentation on the capability and the utility of the 
Threat Evaluation and Countermeasures Agent (TECA) as an artificial intelligence 
user aid for threat evaluation at an afloat task force command and control 
center (TFCC) will be phased to correspond to the development of the TECA tech- 
nology. The experiments described here will commence during the latter half 
of FY 1977 when the test resources are assembled. 

Experimentation will be performed in stages ranging from simple static
debug/validation experiments which concentrate on the capabilities of the TECA
technology and require few resources to more extensive technical evaluations
of TECA with a dynamically changing data base in a controlled environment and
concluding with dynamic operational evaluations in simulated C² environments
pitting orange and blue task forces against each other under various scenarios 
using WES. The staging of experimentation will allow the ACCAT team and the 
experimentation personnel to gain experience with the threat evaluation prob- 
lem, TECA, WES, the display devices and other test resources, and it will pro- 
vide adequate time for the development of WES and TECA. The results of the 
early phases will provide feedback that may lead to improvements in TECA and 
WES. 

The initial experiments will consist of inputting various static threat
situations concerning the status of blue and orange task forces into the ACCAT 
data base and observing if TECA will signal the existence of the threats, 
provide the threat conditions, and give recommended courses of action (COA). 

The test objective will be to evaluate the capabilities of TECA. Any problems 
detected in this validation stage with TECA should be corrected before pro- 
ceeding to the technical evaluation of stage two. 

The stage two experiments will be conducted in a more realistic C² 
environment with a dynamically changing data base, but they will still focus 
on the capabilities of TECA. Three detailed scenarios will be used in conjunction 
with WES to generate threat situations and data-base updates to present a wide 
range of threats to be processed by TECA. The primary objective of these tests 
is to evaluate the capability of TECA under dynamically changing circumstances 
and to probe for the limits of its capabilities, e.g., what threat intensity is 
required to "overload" TECA, or what is the length of time from the instant 
a threat occurs until TECA signals the threat. 

In the third stage of testing, WES will again be used with orange and 
blue task forces which are composed as closely as possible, within the capa- 
bilities of available test resources, with equipment (sensors, weapons, etc.) 
and platforms that are projected for the 1980-1985 timeframe. Experimental 
emphasis will shift from a technical evaluation to an operational evaluation 
of the value of TECA to the decision maker. The objective will be to determine 
if TECA improves the commander's ability to make rapid and accurate decisions. 
Operationally realistic scenarios such as those developed at OPNAV or the Naval 
War College will be used in the evaluation. Each scenario will be replicated 
with different players with some trials having the blue forces operating with 
TECA and some trials without TECA. Operational measures of effectiveness, which 
will depend on the scenarios under play, will be used to assess the operational 
utility of TECA. In addition, the players and the umpire team will be subjected 
to post-exercise interviews and questionnaires to get subjective evaluations 
of the worth of TECA. 

Prior to the initiation of the formal experiments, ACCAT personnel should 
perform pilot trials to make sure that the designed experiments test over a 
range of conditions that will be both realistic and informative and to head off 
any problems with the test resources. 

1. EXPERIMENT: 

a. TITLE: Stage-One TECA Experiments (Static Evaluations) 

b. NUMBER: II-1 

2. OBJECTIVE : Stage-one experiments will evaluate the capability of TECA to 
identify, in a static environment, the threats that confront an afloat task 
force, to describe the threat conditions, and to provide recommended courses 
of action. These experiments will establish the types of threats that TECA can 
identify and will assess the timeliness of the TECA warnings. The experiments 
will also serve to provide feedback through which TECA can be refined. 

3. RESOURCES REQUIRED : 

a. TECA and associated computer hardware 

b. ACCAT data base 

c. 24 threat situation data plates 

1) Each data plate will be a data snapshot describing a threat 
situation at a given time. A given plate will include various 
types of threat conditions which confront the blue forces. The 
actual threats will correspond to the state of development of 
TECA; i.e., the plates for the initial tests will include only 
those threats that TECA will accommodate through phases 1 and 1a. 

2) Each data plate will describe the situation concerning the plat- 
forms, positions, courses, speeds, sensors, weapons, fuel state, 
etc. in a state vector as would be generated by WES. However, WES 
will not be needed to generate the plates. Instead, the state 
vectors will be read directly into the PDP-10. Table VI shows 
the type of information required for one of the data plates. 

Figure 4 gives a graphical description of the situation and Figure 
5 shows a blown-up view of the situation. A graphical description 
will be given for each plate for comparison with the TECA display 
and to aid in the human evaluation of threats. 

PLATFORM NAME  TYPE  LAT    LONG    COURSE  SPEED   GAL. OF FUEL  CF  SURF. MISSILES   GUNS
TRUXTON        CGN   34-14  128-48  045°T   18 KTS  12,000        55  SMI - 25M        9 MI
SPRUANCE       DD    34-09  128-48  045°T   18 KTS  12,000        55  --               9 MI
KNOX           FF    34-04  128-42  045°T   18 KTS  100           50  SMI - 25M        9 MI
ROBISON        DDG   34-00  128-51  045°T   18 KTS  12,000        55  40 TARTAR 15 M   9 MI
LOS ANGELES          33-00  128-30  010°T   5 KTS   100,000       1   SUBROC

FIGURE 4. GRAPHICAL DESCRIPTION OF DATA PLATE #1 



3) Personnel at the Naval Postgraduate School will aid in the de- 
velopment of the data plates by suggesting some of the threat 
situations. Primary responsibility for the development of the 
plates must reside in the ACCAT team since they have the hardware 
resources needed to generate the data plates and they are in close 
touch with the development of TECA. 

d. Display terminal 

1) Initially a monochromatic alphanumeric display terminal will be 
sufficient for the stage-one tests. 

2) At the conclusion of Phase 1a of the TECA development, the 
GENISCO display system will be needed to test TECA's capability 
of displaying locations, identities and motion vectors of both 
friendly and threat platforms and their respective weapon and 
sensor coverage areas at the present time and at specified 
future times. 

e. Personnel 

1) The tests will require a threat assessment team to manually 
evaluate each data plate and identify all threats. This human 
threat evaluation will be done prior to the tests of TECA. 

2) Two subjects will be needed during the experiments; one to enter 
the plates into the data base and one to monitor the TECA-driven 
display. 

3) Post-exercise analysis personnel will be needed to compare the 
threat situations with the TECA descriptions of the situations 
and to analyze the time lags. 

f. Timing device to measure the elapsed time between insertion of the 
data into the data base and the TECA warning signal. (A software 
clock is suggested to record the times automatically. This will require 
a small amount of programming, but will give more reliable measurements.) 

g. Software 

1) Software must be written to input the data plates into the data 
base and to send a stimulus to notify TECA of the data-base update. 

2) Software will be required to measure and record the times of the 
data-base updates and the times at which TECA sounds its warnings. 

3) Software will be required to record all TECA-generated output 
messages. 
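
The timing and recording functions listed above might be sketched as follows. This is only an illustration in a modern language, not a description of the ACCAT software; the class and method names (TrialLog, mark_update, mark_warning) are hypothetical, and the clock is passed in as a parameter so the sketch can be checked without real hardware.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TrialLog:
    """Hypothetical record for one trial: plate entry time and TECA output."""
    plate_id: int
    clock: Callable[[], float] = time.monotonic  # assumed software clock
    update_time: Optional[float] = None
    warning_time: Optional[float] = None
    messages: List[str] = field(default_factory=list)

    def mark_update(self) -> None:
        # Called when the data plate is inserted into the data base.
        self.update_time = self.clock()

    def mark_warning(self, message: str) -> None:
        # Called for every TECA message; only the first sets the warning time.
        if self.warning_time is None:
            self.warning_time = self.clock()
        self.messages.append(message)

    def time_lag(self) -> Optional[float]:
        # Elapsed time between the data-base update and the first warning.
        if self.update_time is None or self.warning_time is None:
            return None
        return self.warning_time - self.update_time
```

Recording every message while timing only the first keeps the "hard copy" of TECA output complete without disturbing the lag measurement.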

4. GENERAL CONTEXT : 

a. Concept and Need 

TECA is being developed to interact with the command and control data 
base, with computational models, and with a graphics display system to 
support the commander's decision processes. In order to assess the 
value of the TECA technology we must determine its capabilities and 
its shortcomings. For TECA to be useful to a decision maker in a 
command and control environment it must be able to identify and evaluate 
all threats, describe the threat conditions, recommend countermeasures 
and display the situation in a timely manner. These experiments will 
assess those capabilities of TECA. Because of the evolving nature 
of TECA and the short time period between the delivery of the ACCAT 
hardware and the scheduled beginning of experimentation, the tests 
will begin with relatively simple experiments requiring a minimum of 
test resources. 

b. General Situation and Scenarios 

Twenty-four threat situation data plates will be developed to test 
the capability of TECA. The plates will be selected to include as many 
as feasible of the phase 1 threat situations that TECA will be ready to 
evaluate. Each plate will contain multiple threats, each of which would 
constitute a threat to some member of the friendly forces. 
The pilot trials should be used to explore the number and types of threat 
situations, consistent with the state of the TECA technology, that should 
be contained in a given plate. 

A threat situation data plate will be selected at random, 
its number recorded and the data input into the ACCAT data base. The 
time of input should be recorded automatically by a software clock. 

An observer, unaware of the contents of the data plate or the input 
time, will monitor the display terminal receiving information from 
TECA. When TECA signals the existence of a threat, the time of the 
signal, the threat conditions and the recommended countermeasures will 
be recorded automatically. 

The data plates need not be replicated since the outputs for a given 
data plate will be exactly the same from replication to replication. 

Thus, only 24 trials will be run. Allowing approximately five minutes 
per trial and a rest period half the way through the trials, the stage- 
one experiments will require no more than three hours to run. 
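
The random selection of plates and the time estimate above can be checked with a short sketch. The function name, the seed parameter, and the 15-minute rest period are assumptions made for illustration; the plan itself only specifies 24 plates, about five minutes per trial, and one rest period.

```python
import random


def plan_stage_one(n_plates=24, minutes_per_trial=5, rest_minutes=15, seed=None):
    """Return a random plate order and the estimated session length in minutes."""
    rng = random.Random(seed)
    order = list(range(1, n_plates + 1))
    rng.shuffle(order)  # each plate is run exactly once, in random order
    total_minutes = n_plates * minutes_per_trial + rest_minutes
    return order, total_minutes
```

With the assumed 15-minute rest, the 24 trials take 135 minutes, which is inside the three-hour bound stated above.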

5. EVALUATION : 

a. Data Collection 

The following data should be collected for each trial: 

1) The data plate identification number 

2) The time of entry of the data plate into the data base 

3) The time at which TECA signals the existence of the threats 

4) The time at which the TECA evaluation of the threat situation has 
terminated as measured when the output at the display terminal 
ceases 

5) The threat conditions and recommended countermeasures given by 
TECA (a "hard copy" printout of the displayed information). 

From these data (summarized in Table VII) and the human threat 
assessments performed prior to the stage-one experiments, we can 
determine time lags, false alarms, missed threat identifications, and 
TECA's threat prioritization. In addition we can evaluate the adequacy 
of the threat descriptions, the recommended countermeasures, and the 
displayed information. 

TABLE VII: DATA SHEET FOR STAGE-ONE TECA TESTS * 

The human-assessed threats should be documented for comparison 
with the TECA output. For example, for the data plate described 
by Table VI and Figures 4 and 5 the threat assessment might be as 
described in Table VIII. 



PLATFORM   THREATS

TRUXTON    1. Within effective range of SSN-11; BRG 355°T, 6 NM
           2. Within effective range of SSN-11; BRG 335°T, 13 NM
           3. Within effective range of SSN-3; BRG 027°T, 24 NM
           4. Within effective range of 76MM Guns; BRG 335°T, 6 NM

SPRUANCE   1. Within effective range of SSN-11; BRG 317°T, 15.5 NM
           2. Within effective range of SSN-3; BRG 003°T, 27 NM

KNOX       1. Within effective range of SSN-11; BRG 344°T, 17 NM
           2. Warning: On present course and speed, will be within
              effective range of SSN-3 in 5 minutes. Threat platform
              now BRG 012°, 32.5 NM. Recommend come starboard 45°.
           3. Warning: Will run out of fuel in approximately 2 hours

ROBISON    1. Warning: On present course and speed, will be within
              effective range of SSN-3 in 40 minutes. Recommend come
              starboard 10°.

TABLE VIII: THREAT ASSESSMENT DATA SHEET 

b. Analysis 



The analysis undertaken for the stage-one experiments will be con- 
centrated in two primary areas: 

1) Comparison of the TECA output with the human assessments of the 
threats 

2) Determination of the times required for TECA to process different 
types of situations 

Post-test analysis will examine the data sheet to ascertain if TECA 
accurately evaluated all phase-one threats and gave warnings in a 
timely manner. The recommended countermeasures will be examined 
primarily for debugging purposes. The accuracy of the TECA threat 
descriptions and COAs will be evaluated. 
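
The comparison of TECA output with the human assessments can be sketched as a set comparison per platform. The function name and the short threat labels are hypothetical, and any TECA output fed to it in an example is invented purely for illustration; no actual TECA results are implied.

```python
def compare_assessments(human, teca):
    """Compare threat sets per platform.

    human, teca: dict mapping platform name -> set of threat labels.
    Returns (missed, false_alarms), each a dict of platform -> set.
    """
    missed, false_alarms = {}, {}
    for platform in sorted(set(human) | set(teca)):
        expected = human.get(platform, set())
        reported = teca.get(platform, set())
        if expected - reported:  # threats the assessors found but TECA did not
            missed[platform] = expected - reported
        if reported - expected:  # TECA warnings with no matching human assessment
            false_alarms[platform] = reported - expected
    return missed, false_alarms
```

A missed threat here is one the assessment team listed but TECA did not report; a false alarm is the reverse. Judging the adequacy of descriptions and countermeasures would still require human review.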
c. Anticipated Results 

It is anticipated that TECA will signal the existence of threats with 
very little time delay and that the threat conditions and countermeasures 
will be presented accurately. For those cases where complex multiple 
threat situations are present simultaneously, the tests may indicate 
the need for a threat prioritization routine and/or better methods 
for displaying threat information. 

6. COMMENTS AND SPECIAL INSTRUCTIONS 

Care should be taken to include in the data plates all of the types of 
threats that TECA has been designed to handle in realistic situations. 

As suggested above, these tests can be automated for the most part with 
no need to replicate. However, if Phase 1A of the TECA development is on 
schedule and the GENISCO display can be used in these experiments, we 
recommend that several different operators be subjected to the complete 
set of plates and be asked to evaluate subjectively the displayed 
information. Such feedback will provide useful information concerning 
refinements of TECA and will provide direction for follow-on display 
experiments. 

It is possible that the threat situation data plates could serve double 
duty by also being used in the display experiments. There are some dif- 
ferences in the input mechanisms, but perhaps software could be developed 
to allow the TECA plates to be used for the display experiments. The process 
of developing the data plates is time consuming and many of the experiments 
will require such static data plates. Therefore, it is important that the 
ACCAT team have the capability of producing such experimental resources. 






B. STAGE-TWO TECA EXPERIMENTS 

1. EXPERIMENT : 

a. TITLE: Dynamic Technology Assessment 

b. NUMBER: II-2 

2. OBJECTIVE : To assess the technical capability of TECA actions on a 
dynamically changing data base generated by creating threat situations 
through the exercise of WES; to determine the "limits" of the capability 
of TECA; and to obtain subjective opinions about TECA. 

3. RESOURCES REQUIRED : 

a. TECA and associated computer hardware 

b. WES 

c. ACCAT data base 

d. GENISCO display system for blue team and a display terminal for the 
umpire team 

e. WES generated tapes for three scenarios 

f. Four static threat situation data plates 

g. Personnel 

1) Umpire team (threat assessors) 

2) Two WES trained operators 

3) Six observers of the TECA output 

4) Test director 

h. Software to measure and record the times at which TECA messages are 
output and the times designated by the observers, and software to 
record the information displayed by TECA 

i. Tapes of the WES outputs of three scenarios involving blue and orange 
task forces 

4. GENERAL CONTEXT : 

a. Concept and Need 



A complete technological assessment of the capability of TECA must 
include an evaluation of TECA in a realistic operational environment 
with a dynamically changing data base. The experiments must subject 
TECA to a wide variety of threat conditions under varying levels of 
threat intensity in an effort to exercise all capabilities of TECA 
and to try to saturate TECA. The tests should attempt to probe the 
"limits" of the capability of TECA. 

b. General Situation and Scenarios 

Three scenarios will be generated to present a wide variety of threat 
conditions and varying levels of threat intensity. Two WES trained 
operators, representing the blue and the orange forces, will play out 
the war games without TECA according to the scenarios and a taped 
record of each game will be made. (The taped records will consist 
of the files of information corresponding to the periodic WES updates 
of state vectors of the war games.) Pilot trials should be conducted 
to determine the appropriate levels of threat intensity to be included 
in the war games. The umpire team will determine those situations and 
times that the blue forces were threatened by the orange forces during 
the war game. 

Each experimental subject will go through a short TECA indoctrination 
session during which he will view four of the static threat situation 
data plates developed for the stage-one tests. The subject will also 
be told that he will monitor a display of the blue forces receiving threat 
information from TECA and that he should signal after each TECA warning 
when he understands the threat situation, but that he will have no control 
over the actions of the blue forces. 

During an actual experimentation run the tape of one of the three 
WES games will be played back with one of the six observers monitoring 
the blue forces which receive threat warnings and messages from TECA. 
The times at which the data base is updated, the times at which TECA 
signals the threats, the threat conditions as described by TECA and the 
recommended countermeasures should all be recorded automatically. In 
addition, the blue observer will signal when he feels that he comprehends 
the threat situation by depressing a specified key on a display keyboard. 
The times that the key is depressed will be recorded automatically. 

The three scenarios should each have a duration of approximately two 
hours. Each of the six observers will observe all three scenarios. For 
each operator the order of the scenarios will be sequenced differently 
to balance out any temporal effects. The following schedule of trials 
should be followed: 



Operator     Scenario Sequence

   1            (1, 2, 3)
   2            (1, 3, 2)
   3            (2, 1, 3)
   4            (2, 3, 1)
   5            (3, 1, 2)
   6            (3, 2, 1)
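
The schedule above uses each of the six possible orderings of three scenarios exactly once, so every scenario appears twice in each trial position. A minimal sketch that regenerates it (the function name is hypothetical):

```python
from itertools import permutations


def scenario_sequences(scenarios=(1, 2, 3)):
    """Assign one distinct ordering of the scenarios to each operator."""
    # permutations() yields orderings in lexicographic order for sorted input.
    return {op: seq for op, seq in enumerate(permutations(scenarios), start=1)}
```

Because all orderings are used, order effects such as learning are balanced across scenarios when the six operators are analyzed together.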



At the conclusion of each trial the observer will undergo a de- 
briefing period and a questionnaire will be administered to obtain the 
observers' subjective assessment of TECA. Sample questions are shown 
in the next section. 

5. EVALUATION : 

a. Data Collection 

The following data will be collected for each scenario: 

1) The times at which the data base is updated 

2) The times at which the threat conditions were signalled by TECA 

3) The threat conditions as described by TECA 

4) Recommended countermeasures 

In addition, for each trial the time interval between each data-base 
update and the observer's signal of his understanding of the situation 
will be measured and recorded. 

These data will be augmented by the threat assessments made by the 
umpire team and the subjective appraisals by the observers to evaluate 
the following information: 

1) What is the delay time for TECA to acquire data and issue threat 
warnings? 

2) What are the saturation points of TECA? 

3) What are the saturation points of the subjects? How many threats 
can a subject comprehend? 

4) What are the delay times for the subject to understand the threat 
situations? 

5) Were there any false warnings or missed threats? 

6) Were there any learning effects within observers? (Is there a 
sequence by observer interaction?) 

7) Is there a scenario by operator interaction? 

8) What are the observer's opinions about TECA? 

The questionnaires given to each operator at the conclusion of each 
trial will consist of a few questions to identify the subject, determine 
his prior C² experience, and to identify the scenario just completed. 
These boilerplate questions will be followed by no more than 10 ques- 
tions soliciting the observer's opinions about the following: 

1) The TECA signalling device 

2) The TECA output method 

3) The type of information given by TECA 

4) The most useful information given by TECA 

5) The least useful information given by TECA 

6) Features that should be added to TECA 

7) The value of permitting the observer to selectively suppress some 
of the TECA information 

8) The usefulness of TECA to a decision maker 

9) The type of situation where TECA is most useful 

10) The adequacy of the recommended countermeasures 

b. Analysis 

1) Graphical and tabular displays of the data concerning time lags 
and summary data such as means and variances 

2) Analysis of TECA processing time as a function of the number of 
threats, platforms, and/or types of threats 

3) Analysis of observer comprehension time as a function of the 
number of threats, platforms, and/or types of threats 

4) Effects of learning (trial number) on observer's comprehension time 
and/or subjective opinions 

5) Effects of scenario on observer's subjective opinions about TECA 

6) Comparison between the TECA threat descriptions and those by the 
team of threat assessors 
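
The summary statistics in items 1) through 3) might be computed as in the sketch below, which groups time lags by the number of simultaneous threats. The function name and the record layout are assumptions; the same grouping could be applied to platforms or threat types.

```python
from collections import defaultdict
from statistics import mean, variance


def summarize_lags(records):
    """Group time lags by number of threats.

    records: iterable of (n_threats, lag) pairs.
    Returns {n_threats: (mean_lag, sample_variance or None)}.
    """
    groups = defaultdict(list)
    for n_threats, lag in records:
        groups[n_threats].append(lag)
    summary = {}
    for n_threats, lags in sorted(groups.items()):
        # The sample variance is undefined for a single observation.
        var = variance(lags) if len(lags) > 1 else None
        summary[n_threats] = (mean(lags), var)
    return summary
```

A mean lag that rises steeply with the number of threats would be one indication of an approaching saturation point.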

c. Anticipated Results 

We anticipate that we will not be able to saturate TECA through Phase 
1a with the restrictions on update frequency and number of platforms 
imposed by WES. Later, as more threat evaluation tasks are given to 
TECA in phases 2 and 3, we may be able to generate saturation points 
for TECA. We anticipate that the threat warnings will be produced in a 
timely manner and that the observers will consider TECA to be a valuable 
command and control decision aid. 

These experiments should provide a thorough examination of the 
capabilities and limitations of TECA. In addition, they should provide 
an indication of the acceptability of TECA to the user population. 



6. COMMENTS AND SPECIAL INSTRUCTIONS 

Tapes of the three scenarios are requested for these experiments so that 
a given game can be replicated exactly with zero variance for the different 
operators and so that the umpire team (threat assessors) can determine 
once for each scenario what threats exist. Furthermore, having the 
games on tape permits the umpire team to run through the same game several 
times to thoroughly check out all threat conditions. Also, having the 
games on tape would allow trials to be run without requiring WES-trained 
operators to input the instructions required by a detailed script. In 
addition, with the games on tape, an observer could monitor the blue forces 
in one or more of the scenarios in a run without TECA and make comparisons 
with the exact same run with TECA. Finally, if the runs were not available 
on tape, very detailed scripts describing all of the actions of both the 
blue and the orange forces would have to be written for each of the scenarios 
to accomplish the objectives of this test. 

C. STAGE-THREE EXPERIMENTS 

1. EXPERIMENT : 

a. TITLE: Operational Evaluation of TECA 

b. NUMBER: II-3 

2. OBJECTIVE : To obtain an operational evaluation of the military utility 
of TECA to a decision maker at an afloat task force command and control 
center in a simulated C² environment, and to evaluate whether TECA improves 
the decision maker's ability to make rapid and accurate decisions. 

3. RESOURCES REQUIRED : 

a. TECA and associated computer hardware 

b. WES 

c. Three display terminals (blue, orange and umpire teams) 

d. ACCAT data base 

e. Five operationally realistic scenarios 

f. Personnel 

1) Six teams to serve as blue and orange forces 

2) One umpire team 

3) Two WES operators 

4) Test director 

g. Software to measure and record the times at which TECA messages are 
output, the TECA messages, the times at which instructions were given 
by the TECA-aided decision maker, and a record of the instructions. 

h. Experimental Command Center 

4. GENERAL CONTEXT : 

a. Concept and Need 

An assessment of the value of TECA as a decision aid in an operational 
command and control environment can only be made by evaluating whether 
TECA enables a decision maker to better understand the threat situation 
which faces his forces and to make better and more timely decisions. 
These experiments will pit two opposing decision makers in a simulated 
C² environment and seek to measure the value of TECA by comparing the 
decision makers' performances operating with and without TECA. 

b. General Situation and Scenarios 

The operational evaluations of TECA will require five operationally 
realistic scenarios consisting of blue and orange task forces like those 
projected for the 1980-1985 time frame. The scenarios and the initial 
conditions should be selected so that a complete game can be played 
out in approximately three hours. The scenarios will be acted out with 
each team given the flexibility of exercising complete control over 
their forces as long as they do not countermand their ordered missions. 
(The actual commands will be entered into WES by the two WES operators.) 
For each scenario, an operational MOE or multiple operational MOEs will 
be selected and used to assess the operational utility of TECA. 

Because of the free-play flexibility given to the blue- and orange- 
force commanders, the war games and the resulting outcomes will likely 
vary significantly from trial to trial. Consequently, each scenario 
will be replicated three times with the blue forces playing one game 
without TECA to establish a baseline for comparison and two games with 
TECA. The tests will be set up so that each team will play each of 
the other five teams exactly one time and each team will play some of 
its games with TECA and some of its games without TECA. The trial 
matrix should be as shown in Table IX. The first entry of the pair 
corresponds to the orange team and the second entry of the pair cor- 
responds to the blue team. The "T" following the blue team indicates 
that the blue team for that game will be aided by TECA. The restrictions 
on randomization imposed by this experimental design have been incorpora- 
ted in an effort to balance out the learning and the team effects. 
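
The counts in this design are consistent: six teams playing each other once gives 15 games, which is five scenarios with three games each. The sketch below generates one schedule with those properties using the standard circle method. It does not reproduce Table IX; the function names, the choice of which game in each scenario is the no-TECA baseline, and the orange/blue role assignments are all assumptions, and no attempt is made to balance TECA exposure across teams.

```python
def round_robin(teams):
    """Circle method: every team meets every other team exactly once."""
    teams = list(teams)
    if len(teams) % 2:
        raise ValueError("an even number of teams is required")
    n = len(teams)
    rounds = []
    for _ in range(n - 1):
        rounds.append([(teams[i], teams[n - 1 - i]) for i in range(n // 2)])
        teams = [teams[0], teams[-1]] + teams[1:-1]  # rotate all but the first
    return rounds


def trial_matrix(teams=range(1, 7)):
    """Return (scenario, orange, blue, blue_has_teca) for each game."""
    games = []
    for scenario, pairs in enumerate(round_robin(teams), start=1):
        baseline = (scenario - 1) % len(pairs)  # the one game without TECA
        for g, (orange, blue) in enumerate(pairs):
            games.append((scenario, orange, blue, g != baseline))
    return games
```

Each round of the round robin is treated as one scenario, so no team plays twice within a scenario.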

All replications of scenario one should be performed, then the three 
replications of scenario two, etc., until all fifteen games have 
been played. 

TABLE IX. TRIAL MATRIX FOR STAGE-THREE EXPERIMENTS 

A brief indoctrination period lasting approximately thirty minutes 
should precede each team's first trial. In this indoctrination period 
each team will be instructed as to what it will be required to do during 
the tests and a demonstration will be given of WES and TECA. In addi- 
tion, each team will be instructed prior to each trial about the 
scenario, their missions, and the measures of effectiveness. 

5. EVALUATION : 

a. Data Collection 

The following data will be collected for each trial: 

1) The times at which the data base is updated 

2) The times at which the threat conditions were signalled by TECA 

3) The threat conditions as described by TECA 

4) Recommended countermeasures 

5) Actions taken by the decision maker 

6) The times at which the decision maker's instructions were input 
to WES 

7) The data required to determine the selected operational measures 
of effectiveness 

These data will be augmented by subjective evaluations by the umpire 
team and by the observers. Each team playing with TECA will fill out 
a questionnaire like that used in stage two. 

b. Analysis 

1) Data summaries will be made for each scenario and tabular displays 
will be made of the MOEs 

2) A correlation analysis will be made of the time sequence of TECA 
warnings with the time sequence of actions taken by the decision 
makers to try to determine how much the commanders utilize TECA 
and what information is most useful to them 

3) A comparison will be made of the game outcomes as reflected by the 
MOEs using analysis of variance to see if there is any significant 
difference due to TECA. 

4) An analysis of variance will also be performed to test if there were 
significant interactions between scenarios and TECA. 

5) A comparison will be made of the game outcomes and the subjective 
appraisals of TECA to see if there was a relationship between how 
well the blue forces performed and how much they liked TECA. 
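
As a simplified illustration of item 3), the sketch below computes the F statistic for a one-way analysis of variance, for example MOE values from games played with TECA against those played without. It is a sketch only: the function name is hypothetical, it ignores the scenario factor and the TECA by scenario interaction called for in item 4), and it assumes the within-group variation is not zero.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    groups: list of lists of MOE values, one list per treatment level.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Variation of the group means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations about their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

The resulting F value would be referred to an F distribution with the returned degrees of freedom; with only fifteen games, large game-to-game variance will make a significant result hard to obtain.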

c. Anticipated Results 

It is anticipated that there will be large variance in the overall end- 
of-game operational measures of effectiveness. There will perhaps be 
so much "noise" in the MOEs that any signal due to TECA will not be 
discernible. Thus, the subjective evaluations of the teams will probably 
be very important. Their evaluations may provide the only discrimination 
in the test, for there may be no quantitative basis for discrimination 
as to the value of TECA. 

The correlation analysis of the decision maker's actions and the 
TECA threat warnings should reveal the type of information that is most 
useful to the decision makers and how much a decision maker may grow 
to rely on TECA. 

If the statistical noise due to team differences, learning, randomness, 
etc. is not so great, it is anticipated that the analysis of variance 
will reveal that there are significant differences due to TECA and the 
scenario, and that there are significant TECA by scenario interactions. 

Overall, these experiments should yield important information about 
the acceptability of TECA to the decision maker and its utility to him. 

6. COMMENTS AND SPECIAL INSTRUCTIONS 

The usefulness of TECA to a decision maker may depend strongly on the manner 
in which TECA outputs are displayed to the blue forces. This might be 
especially true in intense conflict situations if a large queue of threat 
conditions builds up at a given time. Thus, some experimentation in the 
display area should precede stage three of the TECA experiments to determine 
a good display candidate to be used in these trials. The blue forces display 
terminal should be the GENISCO display. The display system is an integral 
part of TECA so we should evaluate TECA with the best display system 
available. 

Much of the data collected in these experiments can be used to evaluate 
the capabilities of the TECA technology as was done in stages one and two. 
Unless some specific problems with TECA are indicated in the trials, the 
type of analysis done in stages one and two will not be repeated here. 

The data will, however, be available for analysis. A tape record should 
be made of the inputs and outputs of each game so that evaluation personnel 
can reproduce a given trial and conduct an autopsy of what happened at any 
point in the game. With a tape record a lot of potentially valuable 
"what if" types of analyses could be conducted. 

Much of the success of the experiments will depend on the scenarios, 
the starting conditions and the measures of effectiveness that will be used. 
The scenarios used should be operationally realistic, and the starting 
conditions and measures of effectiveness must be selected to exercise a 
wide range of the TECA capabilities and to yield useful information. 

The scenarios required for the TECA experiments should not be developed 
just for the purpose of exercising TECA. Instead, the blue and orange 
task forces should reflect as closely as possible, within the constraints 
of the resources, the compositions of those forces projected for the 1980- 
1985 time frame and the types of missions that would be probable. We 
believe that the ACCAT team will have to conduct some preliminary pilot 
trials to select appropriate scenarios, starting conditions and measures 
of effectiveness. Scenarios are required for just about all of the ACCAT 
experiments and their development could easily become a bottleneck to 
experimentation time schedules. It is important that the ACCAT team 
acquire resources for developing scenarios for testing the command-and- 
control technologies. 

Finally, the subjects used in the experiments will be very important. 

For scientific experimentation purposes we need subjects who have command 
and control experience as decision makers. This would probably require 
Captains or Admirals to serve as subjects. One of the most common criticisms 
of experiments which attempt to determine the operational utility of a sys- 
tem is that the "operators" did not reflect the potential user population. 
This is especially a subject of criticism when the evaluations depend 
heavily on subjective appraisals. We realize the difficulties in getting 
the type of person actually needed for these experiments but we suggest 
that such an effort be made. The point at issue here is whether these 
tests are really to be viewed as scientific experiments or demonstrations. 

If they are to be considered as experiments, then we need operational 
realism so that our conclusions will be credible. 

III. LADDER EXPERIMENT



EXECUTIVE SUMMARY . A static "discrete slide" of data contained in Blue File 
will be accessed using INLAND with its natural language input and IDA with 
its simple query language. The objectives are to assess training/learning 
requirements for operators with the two options, as well as to compare their 
times to recover the information requested, compare the respective error 
rates in formulation of query messages and to obtain subjective evaluations 
of the two languages. 

For follow-on evaluations it is desirable to use a dynamic data base 
situation, with a free play "scenario" involving operators interacting with 
decision makers and the query systems, under varying degrees of stress and 
query traffic. It would be desirable, therefore, to develop a method of 
making WES output accessible to LADDER through the Blue File; we recommend
this task be undertaken by ACCAT as a high priority goal. 

1. EXPERIMENT : 

a. TITLE: LADDER Experiment to Evaluate Relative Performances with
INLAND and IDA

b. Number III-1

2. OBJECTIVE : To assess operators' ability to access a limited data base
using two levels of query language; to evaluate operators' ease of learning
with the two levels; to determine effects of differences in query flexibility 
at the two levels; and to obtain operators' subjective opinions about the 
two approaches to accessing the Blue File data base. 

3. RESOURCES REQUIRED : 

a. Computer for use with LADDER software 

b. Display and Keyboard input device 

c. LADDER software, together with software required to measure and 
record times between events, to allow keyboard input, to display 
information (operator instructions, questionnaire, etc.) and to
record operator responses. 

d. A "Blue File"-like data base containing information to be accessed 

e. Two query lists, with programmed entry times 

f. 8 Operators (needn't be familiar with LADDER) 

g. Test director, LADDER language instructor 

h. Instruction and training materials for teaching operators the IDA 
query language and the LADDER natural language 

i. Questionnaire for presentation on display 

j. Typing test, to be administered through display, with automatic 
recording of completion time 

k. Operator briefing and prompting information for "directing" trials 

4. GENERAL CONTEXT :

a. Concept and Need 

The need for improvement in the man/machine interface has led to
development of LADDER for use in querying certain data bases.
It is not known, however, what level of query language might
provide the best access to data bases for C2 situations. In this
experiment, two levels of query language, both using the same
data base, are assessed and compared.

b. General Situation and Scenario 

Operators are trained to use the IDA and INLAND query languages. 
Times required for each operator to become proficient with each 
language are recorded, for subsequent analysis. A training
"instructor" and training materials developed by ACCAT personnel
are to be used; possibly all training can be accomplished
with an interactive display package. Care should be taken to train
half of the operators first with IDA and the other half first with
INLAND, since there is probably carryover from one language to the
other, and we wish to evaluate the differences in training
requirements with the two languages.

Operators receive displayed requests for information, with each
request added to the operator's "List" or queue in accordance with a
preselected schedule. Two such programmed schedules are required,
say program "A" and program "B". The operator is instructed to
use the first-in-first-out discipline in processing requests. The
programmed schedule of request arrivals is designed so as to provide
varying load (traffic volume) and difficulty (translation of the
displayed "verbal" request to INLAND or IDA query form). This is done
so as to enable evaluation of the operator's "learning" and fatigue
effects, the operator's responses to busy periods, and the operator's
request handling capacities. A number of operators (eight is
suggested) would each receive two such trials, one with each query
language, in accordance with the schedule shown below. The
operators will be trained so they are competent with both the INLAND
interface method and IDA. The trials should last roughly two hours,
including an initial 15 minute "calibration" period and a final 15
minute "debriefing" period. Each operator remains at his task
until he has served all the programmed requests for the trial.

The total time required for a trial would vary due to differences 
in operators and query languages. 

The calibration period at the beginning of a trial is used to 
bring the operator "up to speed" with the particular query language 
to be used in the trial, and to administer a short typing task. 

The typing speed and accuracy of the operator will be measured by 
displaying a short passage and asking the operator to key it into 
his display keyboard. The operator will be required to edit his
response until it contains no errors; his score is the total time
required. This information will later be used to determine its degree
of correlation with operator performance using these query methods. 
The typing test will be administered only at the beginning of the 
first trial encountered by each operator. The calibration period 
will begin with a displayed briefing of the experiment, the opera- 
tor's role, how long it will take, and instructions for performing 
the trial.

The debriefing period will be used to administer a questionnaire 
to each operator with the display, following the operator's second 
(and final) trial. The questionnaire is designed to determine the 
operator's subjective assessment of each query method. A sample 
questionnaire is shown below. 

The preprogrammed schedules, A and B, should have about the 
same length, frequency of arrivals profile and difficulty of re- 
quests mix. The requests should all be processable using IDA. 

The mix of requests should "span" as much as possible this set 
of feasible queries. It is possible a trial and error pilot
program will be required before reasonable programs can be designed,
so that request arrival rates and request difficulties are
neither too easy nor too difficult.

A schedule of the following form should be used in conducting 
the experiments ("a" denotes operator's first trial, "b" his 
second trial; numbers in table are operator numbers). 

                          Program

                      A                 B

            IDA       1a, 3b, 5a, 7b    2b, 4a, 6b, 8a
Language:
            INLAND    2a, 4b, 6a, 8b    1b, 3a, 5b, 7a

Thus, for example, operator 3 first has a trial with INLAND using
program B; later he has a trial with IDA using program A.

5. EVALUATION:

a. Data Collection

It is proposed to measure or record the following during each
trial:

1) Time of arrival of each request

2) Time of completion of each query message

3) The query message itself (as typed by the operator)

4) Use of error recovery feature

5) Time of receipt of answer

From these data one can determine queue lengths, arrival rates
of requests, times required to prepare query messages, flow rate
of completed messages, error rates in message preparation, nature
of errors in message preparation, each as a function of time in
the trial as well as integrated into corresponding overall measures
for the trial. Analyses with these measures and the calibration
and debriefing data in turn can give:

1) Amount of response time attributable to query formulation

2) Paired comparisons of performances with two query languages
(paired on operator "skill")

3) Time effects during the trials (learning, fatigue, etc.)

4) Saturation points of operators

5) Why unsuccessful query attempts occurred

6) Response of operators to queue length (in terms of flow rates 
and error rates) 

7) Differences among operators 

8) Effects associated with typing ability, amount of interactive 
terminal experience, etc. 

9) Subjective opinions of operators 
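
The derived measures listed above start from the recorded event times. As a minimal sketch (Python is used only for illustration; the record layout and the names RequestRecord, formulation_times and queue_length_at are hypothetical and are not part of any ACCAT software), the time attributable to query formulation and the queue length at a given instant could be computed as follows, assuming the operator follows the first-in-first-out discipline and begins each request as soon as he is free:

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    arrival: float    # time the request was added to the operator's list
    completed: float  # time the operator completed the query message


def formulation_times(records):
    """Time spent on each request, assuming first-in-first-out service."""
    times = []
    previous_done = 0.0
    for r in sorted(records, key=lambda rec: rec.arrival):
        # Work on a request starts when it has arrived and the operator is free.
        start = max(r.arrival, previous_done)
        times.append(r.completed - start)
        previous_done = r.completed
    return times


def queue_length_at(records, t):
    """Number of requests that have arrived but are not yet completed at time t."""
    return sum(1 for r in records if r.arrival <= t < r.completed)
```

Evaluating queue_length_at on a grid of times gives the queue length as a function of time in the trial; the same records, grouped by operator and language, feed the paired comparisons and regressions listed in the statistical analysis.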

b. Anticipated Statistical Analysis of Data : 

1) Graphical and tabular displays of data and data summaries 

2) Analysis of operator differences (test of no difference, 
variance estimates, operator X technology interaction 
characteristics, maximum process rate capabilities) 

3) Effects of learning, fatigue (perhaps by regression on time 
or request number) 

4) Effects of queue length (regression of error rate on queue 
length, and of operator process time on queue length) 

5) Correlation analysis between: prior experience and various
measures of performance (such as error rate, processing time);
typing ability and various measures of performance; training
time required and various measures of performance

6) Paired comparisons of various measures of performance under 
INLAND and IDA 

7) Degree of association between subjective opinion and various
measures of performance

8) Analysis of training requirements for INLAND and IDA languages, 
together with evaluation of differences among operators 
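
The paired comparisons of item 6 pair each operator's performance under INLAND with the same operator's performance under IDA, so that operator differences cancel. A minimal sketch of the paired t statistic follows, assuming one summary measure per operator per language (for example, mean time per request); the function name paired_t is hypothetical, and other paired procedures could equally be used:

```python
import math
import statistics


def paired_t(inland, ida):
    """Paired t statistic for per-operator differences (INLAND minus IDA)."""
    if len(inland) != len(ida) or len(inland) < 2:
        raise ValueError("need two equal-length samples from at least 2 operators")
    diffs = [a - b for a, b in zip(inland, ida)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    # Returns the mean difference, the t statistic and its degrees of freedom.
    return mean_diff, t, len(diffs) - 1
```

With the eight operators suggested, the statistic would be referred to a t distribution with 7 degrees of freedom.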

c. Anticipated Results 

We expect to find INLAND easier to learn to use, but IDA to be 
more flexible in the queries it can handle and to be less time
consuming in total time to answer an input request for information. 
There will probably be large differences in operators, in terms of 
learning times required, and in resultant skills in using the query 
languages. Error rates by operators will probably increase with 
increasing length of the queue of requests and with fatigue of the 
operators. It is expected operators may prefer the INLAND language, 
although this is not clear. 

Overall, this experiment should provide important information 
about man/machine interface languages and some of the important 
factors affecting data base access activities. In addition, the 
experimentation technology developed should be useful for future 
experimentation at ACCAT. 

6. COMMENTS AND SPECIAL INSTRUCTIONS

Early attention should be given to the problem of training operators 
in the two languages, and in collecting useful information about 
training requirement differences, if any. This will require develop- 
ment of a training procedure and training materials, possibly for use 
with an interactive terminal setup. 

It is anticipated the entire test sequence for each operator can 
be largely automated through use of the terminal, with keyboard input 
and display output. This would appear to be an efficient approach, 
well worth the investment in development of the required software 
resources. It is assumed these resources will be developed by the 
ACCAT team. 

Analysis of the data requires recorded information coded and 
correlated by operator, with all pertinent information available within 
each operator's "record". These data must be available in a form
readable by the IBM system at NPS. At least 30 days are required for 
the analyses proposed. 



A sample questionnaire for administration to operators after their second (and final) 
trial is shown below. 

SAMPLE QUESTIONNAIRE FOR VALIDATION PHASE 

1. What is your name? 

2. What is your rank? 

3. What is your specialty? 

4. How many years have you been involved in command control operations? 

5. How long ago (years) did you first use a data base query language? 

6. Which query language did you like best? 

A. INLAND 

B. IDA 

7. Which query language do you feel is most reliable? 

A. INLAND 

B. IDA 

8. Which query language do you think would be best overall for command control 
use? 

A. INLAND 

B. IDA

9. Which query language do you feel is easiest to learn? 

A. INLAND 

B. IDA 



THANK YOU FOR YOUR PARTICIPATION IN THIS EXPERIMENT 



END OF TRIAL 



IV. RAND TERMINAL AGENTS EXPERIMENTS 



EXECUTIVE SUMMARY . Five "terminal agents" (TA's) developed by Rand are available
for initial ACCAT evaluation: RITA (applied to data-base querying), NED
(2-dimensional text editor), VT (virtual terminal facility), REMIND (alarm clock)
and MS (message system). These TA's enjoy a symbiotic relationship in a
communications/message/remote data accessing environment. Two experiments are
proposed, one with a static data base and simulated message source and sink,
the other with a dynamic data base driven by a WES game. Both experiments are
designed to allow assessment of the utilities of the TA's, in certain specified
mixes, under simulated high-intensity situations requiring messages (sending
and receiving), data base queries, and evaluations of situations by operators.
The training required for operators to learn to use the TA's effectively, as
well as to generate RITA rules for specified hybrid functions, will be assessed.

1. EXPERIMENT : 

a. TITLE: Rand Terminal Agents Evaluation with Static Data Base 

b. Number IV-1 

2. OBJECTIVE : To evaluate the utility of five Rand Terminal Agents, in various
mixes, to the C2 operator and decision maker; to evaluate training requirements
for preparing an operator to effectively use the Terminal Agents; to assess
the operators' use of and interaction with the Terminal Agents in a simulated
C2 environment.

3. RESOURCES REQUIRED : 

a. Computer for use with Terminal Agents software 

b. RITA rules for application to data base querying 

c. 2 displays with one capable of use with the Rand Terminal Agents 

d. Terminal Agents software 

e. Software for use in managing experimentation trials, including display
presentations of operator instruction, questionnaires, messages,
requests for data, etc.; keyboard entry of information (such as
interaction with displays, message formulation, operator responses to questions,
etc.)

f. Other input and output hardware and software as appropriate 

g. Umpire for simulating message source and message receiver, and to 
assess operators' accuracy of evaluations 

h. Appropriate data base with sufficient information for reasonable exercise 
of RITA query system in simulated environment 

i. 3 preprogrammed request lists for information for display to operators 

j. Software clock to measure and record times of occurrence of certain 
events 

k. Training personnel and materials for training operators to use Terminal 
Agents 

l. 6 operators

m. Test director 

4. GENERAL CONTEXT :

a. Concept and Need 

Various user aids are being developed to assist in the C2 man-machine
interface. One group of such aids, the "Terminal Agents" group
developed by RAND, is ready for testing at ACCAT. These aids are intended
to assist operators and decision makers in the C2 arena. It is important
to assess the utility of these aids in various situations, and compare 
their usefulness with other possible options. 

This experiment involves first "training" several operators to use 
the Terminal Agents in order to allow evaluation of training requirements. 
Operators will then engage in simulated situations requiring data 
base querying, message receiving and sending, and evaluations.
Comparisons of operator performance characteristics with several mixes
of Terminal Agents will provide information useful for evaluating more
general man-machine problems as well as assessment of the Terminal Agents
themselves.

b. General Situation and Scenarios 

Six operators will be trained to use the TA's. The operators need not 
be used simultaneously in the experiment; trials will be scheduled over 
a long period to accommodate other experimentation requirements, if 
desired. Consequently, the training should be automated as much as 
possible, probably through use of operator interaction on the terminal 
(display, using the various TA's). Written material documenting the 
TA's should also be available for use by the operators. While this 
training is rather informal, it is more than simply a "hands-on" 
trial and error approach, in that training times required by the various 
operators will be recorded. Training should be performed according to 
the schedule shown below. In addition, during the training phase 
(probably near its end) operators will be tasked to write new RITA 
rules for inclusion in the RITA software to perform specialized func- 
tions. Again, time required to accomplish these tasks, and the success 
rates and causes of failure will be recorded. 

Three mixes of TA's are to be compared: 1) RITA only (for data
base querying), 2) RITA plus Virtual Terminal (VT), and 3) All five
Terminal Agents. Since there is probably a great deal of carryover in
learning from each one of these mixes to the others, efficient
investigation of training requirements dictates concentrating training
on the specific agents in different orders for the various observers. 

We therefore recommend training operators in the following sequences: 

OPERATOR    RITA    VIRTUAL TERMINAL    NED, REMIND, MS

   1          1            2                   3
   2          1            3                   2
   3          2            1                   3
   4          2            3                   1
   5          3            1                   2
   6          3            2                   1


For example, operator 3 first learns to use Virtual Terminal, then 
VT plus RITA, and finally VT plus RITA plus NED plus REMIND plus MS. 
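
The six training sequences are the six possible orderings of the three training groups, one ordering per operator. As a minimal sketch (the names GROUPS, training_ranks and training_order are hypothetical, chosen for illustration), the ranks in the table can be generated and turned into a learning order as follows:

```python
from itertools import permutations

GROUPS = ("RITA", "VIRTUAL TERMINAL", "NED, REMIND, MS")


def training_ranks():
    """Map operator number -> rank (1 = learned first) of each TA group."""
    return {op: ranks for op, ranks in enumerate(permutations((1, 2, 3)), start=1)}


def training_order(ranks):
    """Turn per-group ranks into the order in which the groups are learned."""
    return [group for _, group in sorted(zip(ranks, GROUPS))]
```

Because every ordering is used exactly once, each group is learned first, second and third by two operators each, which is what allows carryover between groups to be separated from the training requirement of each group.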

Times required to become proficient (as judged by an umpire), together 
with significant problems encountered by the operators should be recorded 
for each operator in each configuration mix. Finally, a set of two or 
three situations requiring specialized RITA rules to be written, in- 
serted into the RITA package, and exercised with the data base should 
be presented (one at a time) to each operator. Times required to 
write the rule and to successfully query the data base with the rule 
should be recorded by the umpire, as well as recording any significant 
problems encountered by operators in this process. 

After completing the training phase, each operator will engage 
in three "trials" in the experiment, one with each TA mix. Each trial 
involves the operator entering a simulated ongoing high-intensity 
situation. The operator is presented (probably via an auxiliary dis- 
play) a preprogrammed sequence of requests for information, arriving 
in time so as to simulate requests for information by the decision maker. 
The operator will be tasked to write, edit and "send" messages, and to 
receive and analyze messages from a simulated source (perhaps the umpire). 

The operator will be asked questions (via the auxiliary terminal)
initiated by the umpire at prespecified points (times) in the trial.
These questions are designed to determine the operators' understanding
of the situation represented in the scenario, and to simulate questions
asked by the decision maker. Times required to respond to the questions,
accuracy of the responses, and action taken by the operator following
receipt of the question will be recorded.

The six operators will use each of these TA mixes, each under
different simulated C2 situations. Thus, 3 situations, involving different
but similar programmed request lists and specified umpire messages and
actions need to be prepared for the experiment. The observers should
undertake the mixes and programs in the following sequence:

                      Program

                1        2        3

         1     1, 1     2, 1     3, 2
               4, 1     5, 1     6, 2

 Mix:    2     3, 1     1, 2     2, 3
               6, 1     4, 2     5, 3

         3     2, 2     3, 3     1, 3
               5, 2     6, 3     4, 3

NOTE: Numbers in table are in the form "observer number,
trial sequence number". Thus, observer 5 first
uses mix 1 with program 2, next he uses mix 3 with
program 1 and finally he uses mix 2 with program 3.
This design constitutes a replicated latin square
design.

The duration of each trial should be about one hour. No observer
should have more than two trials in one day (it would be preferable 
to have only one trial per day, if feasible). The total duration of 
trial time should thus be no more than about 20 hours; training time 
should not be more than about five to ten hours per operator, preferably 
received over a period of several days. 
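
The balance of the trial schedule and the trial-time estimate above can be checked mechanically. The sketch below transcribes the schedule with rows taken as TA mixes and columns as programs, which is the reading implied by the NOTE's example for observer 5; that reading and all names in the code are assumptions made for illustration. It confirms that each observer meets every mix once and every program once, and totals the trial hours.

```python
# (mix, program) -> list of (observer, trial sequence number)
SCHEDULE = {
    (1, 1): [(1, 1), (4, 1)], (1, 2): [(2, 1), (5, 1)], (1, 3): [(3, 2), (6, 2)],
    (2, 1): [(3, 1), (6, 1)], (2, 2): [(1, 2), (4, 2)], (2, 3): [(2, 3), (5, 3)],
    (3, 1): [(2, 2), (5, 2)], (3, 2): [(3, 3), (6, 3)], (3, 3): [(1, 3), (4, 3)],
}


def each_observer_balanced(observers):
    """True if every observer has each mix exactly once and each program exactly once."""
    for obs in observers:
        cells = [cell for cell, entries in SCHEDULE.items()
                 if any(o == obs for o, _ in entries)]
        mixes = sorted(m for m, _ in cells)
        programs = sorted(p for _, p in cells)
        if mixes != [1, 2, 3] or programs != [1, 2, 3]:
            return False
    return True


def total_trial_hours(hours_per_trial=1.0):
    """Total trial time over all observers, at the stated duration per trial."""
    trials = sum(len(entries) for entries in SCHEDULE.values())
    return trials * hours_per_trial
```

Six observers with three one-hour trials each give 18 hours of trial time, which is consistent with the figure of no more than about 20 hours.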

5. EVALUATION :

a. Data Collection 

Data from the training period and trial periods for each observer are 
to be stored in a single record, together with identification codes, 
trial conditions, etc., necessary for analysis. These records should 
be on cards or tape readable by an IBM computer system. The following 
data are to be obtained and recorded for each observer: 

1) Training times (one for each TA group) 

2) Times to write RITA rules and to access data with them

For each trial:

3) Times of each request arrival 

4) Times and nature of each message arrival 

5) Time and content of each query for the data base 

6) Time of each response to data base query, and accuracy of response 

7) Times and texts of messages "sent" by observer 

8) Umpire's judgements of accuracy of analyses of situations by observer

9) Time to complete typing task (at beginning of observer's first trial)

10) Data from questionnaires administered to observer and umpire (at
end of observer's third trial)

b. Analysis 

Analyses of variance with time and accuracy data from the trials will
be used to determine whether there were significant differences.
Correlation analysis and regression will be used to associate performance
characteristics with observer and trial or TA mix characteristics.
Observer and umpire subjective evaluations will be summarized and
presented. Observer differences will be analyzed.

c. Anticipated Results

There will be large variance among observers. Even so, differences in 
performance with the various TA mixes should be discernible. Training
requirements should turn out to be easily accomplished, with a day or 
so of "hands-on" experience, although writing successful RITA rules 
may turn out to be difficult for most observers. It is expected data 
base querying, message formulation, and analyses of the "situation" 
should all become easier for observers with more TA's, although large 
variance among observers may mask most statistical significance among 
them. 

6. COMMENTS AND SPECIAL INSTRUCTIONS 

Since there is probably a large association between some of the variables 
in this experiment and observers' typing facility, a typing test should 
be administered prior to the observers' first trial. Even if such as- 
sociation were not of interest to ACCAT, it would be important to make 
these determinations so their effects could be used to adjust for this 
potential source of variance among observers. In addition, correlations 
of various "operational" characteristics with this measure might be of 
interest in connection with determining future C2 staffing requirements. 

The test described in experiment III-1 can be used for this purpose.

After each trial, and at each milestone in the training phase, the 
umpires' and observers' opinions should be collected and recorded. At 
the end of each observer's third trial, a questionnaire should be administered
to the observer and umpire via a terminal and keyboard recording system,
as described for experiment I. The sample questionnaire displayed in
experiment III-1 could be modified in obvious ways to make it suitable for
the present experiment.

RAND TA EXPERIMENTS (CONTINUED)

1. EXPERIMENT: 



a. TITLE: Rand Terminal Agents Evaluation with Dynamic Data Base 

b. Number IV-2

2. OBJECTIVE : To evaluate the utility of five Rand Terminal Agents to the
C2 operator and decision maker; to assess the operators' and decision makers'
use of and interaction with the Terminal Agents in a simulated dynamic
environment.

3. RESOURCES REQUIRED : 

a. Computer for use with Terminal Agents software 

b. RITA rules for application to data base querying 

c. Displays and software for playing WES game 

d. Display for use with Rand Terminal Agents 

e. Terminal Agents software 

f. Software for use in managing experimentation trials, including display
of questionnaires and messages, and recording data obtained in experiment
(such as times of certain events, message traffic, operator responses
to messages, etc.)

g. Other input and output hardware and software as appropriate

h. Software clock to measure and record times of occurrence of certain 
events 

i. Two WES scenarios 

j. Four operator-decision maker teams; one umpire 

k. Test director 

4. GENERAL CONTEXT : 

a. Concept and Need 

The most significant impacts of making TA's available in the command
center may be most evident in a dynamic, war game situation. In
this experiment, operator/decision maker teams play WES games with
and without the TA's, in simulated stressful, high intensity situations.
In order to generate a situation requiring a high density of message
traffic and data base querying, it may be useful to use one game scenario
which calls for interrupting the TA side (during its play) with a
high-priority message from the commander-in-chief changing the team's mission.
This might simulate a Mayaguez rescue mission situation, for example,
with message traffic to and from the commander-in-chief (perhaps played
by the umpire) directing the new operation.

This experiment is designed to assess the relative value of the TA's 
in the above described situations, concentrating on the increased ability 
of the team equipped with TA's to handle the man-machine interface 
problems so generated. 

b. General Situation and Scenarios

Four decision maker/observer teams each play two WES games, one with and
one without TA's, in accordance with the following schedule:



                     With TA's            Without TA's

              1      team 1 (blue)        team 2 (orange)
                     team 3 (orange)      team 4 (blue)
Scenario:
              2      team 2 (blue)        team 1 (orange)
                     team 4 (orange)      team 3 (blue)



The scenarios should present a wide variety of message, query and 
evaluation traffic. The scenarios should build up to intensities 
that probe the limits of the players' ability to keep up, even with
the TA's. Pilot trials might prove useful to determine the appropriate 
levels of intensity to be included in these games. 

The games should be of about two hours duration, and the teams
should not play more than one game per day. Thus, about 8 to 10 hours
of game time, spread over at least four days time, are required. The 
teams should have operators that are experienced in using the TA's. 

5. EVALUATION : 

a. Data Collection 

The data recorded during each game will include the following: 

1) Requests by the decision makers (times of occurrence and text) 

2) Responses by the operators (times of occurrence and text) 

3) Message traffic to and from each command center, including times 
messages were received; time messages were sent; times required 
to prepare messages, and message texts 

4) Times requests for data base information are made by the decision
maker, times queries are finished by operators, times data base
information is received by the decision makers

5) Times of game control input by decision makers (for establishing a 
rough measure of game "intensity") 

6) Times and durations of use of any of the five TA's 

7) Subjective opinions of decision makers, operators and the umpire, 
obtained after each game. (This would probably be most efficiently 
done via display delivered and keyboard recorded questionnaires,
as in earlier experiments.)

b. Analysis 

From the recorded times listed above, flow rates of data base queries, 
as a function of "game time" can be determined, and similarly for mes- 
sage rates. An analysis of factors influencing these flow rates will 
be made, including 

1) Whether TA's significantly increase flow rates 

2) Whether game intensity is a good predictor of the flow rates 
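
The flow rates referred to above are counts of events (queries or messages) per unit of game time. A minimal sketch of that computation follows, assuming event times are recorded in minutes of game time and using a fixed window width; the function name flow_rate and the 10-minute default are hypothetical choices for illustration:

```python
from collections import Counter


def flow_rate(event_times, window=10.0):
    """Events per minute in consecutive windows of game time (times in minutes)."""
    if window <= 0:
        raise ValueError("window must be positive")
    # Assign each event to the window in which it falls.
    counts = Counter(int(t // window) for t in event_times)
    if not counts:
        return []
    last = max(counts)
    # Windows with no events are reported with a rate of zero.
    return [counts.get(i, 0) / window for i in range(last + 1)]
```

Applying the same windows to the times of game control input gives a rough intensity series against which the query and message flow rates can be regressed.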

Comparison of the abilities of teams with and without TA's to respond 
to intense situations will be made through the flow rates and, more
importantly, through subjective evaluations of players and the umpire.

c. Anticipated Results

There will be very large variation in team performance from game to 
game and team to team. These sources of noise may well mask the 
statistical significance of any difference due to presence or absence 
of TA's. Some useful insights into responses of teams to game situations,
including game intensity and message and query flow rates should emerge. 
Subjective opinions about the usefulness of the TA's, together with the 
record of their actual usage (when available) should provide useful 
information concerning this man-machine interface technology. 

6. COMMENTS AND SPECIAL INSTRUCTIONS 

Time data should be recorded in terms of both real (clock) time and
"point in the game" time, so reactions of the teams can be associated
with their stimulus, if possible. The teams need only be present in pairs, 
to play the WES games. Thus, there is flexibility in scheduling in that 
the second pair of teams (Team 3 and Team 4) might play their WES games at 
a much later time than do teams 1 and 2. The players used should be re- 
presentative of the populations of decision makers and operators encountered 
in the fleet. They should not previously have played the scenarios used 
in this experiment. 

Questionnaires much like those presented with earlier experiments should 
be presented to the players (decision makers, operators, umpire), probably 
via the display/terminal approach suggested before. 

All data recorded for a given play of the WES game should be collated 
into one file, with necessary identification codes giving the values of 
the factors used in that trial (i.e., player id's, game id, which team 
played blue side, which team had TA's, etc.). The data should be stored 
on tape readable by the IBM System. At least 30 days analysis time
is required before results can be reported.